What is Claimed is:

1. A method of identifying conditional associations among a plurality of
features in a plurality of samples, the method comprising: 

defining a matrix having a plurality of rows that represent the plurality of 
samples and a plurality of columns that represent the plurality of features, each row- 
column position of the matrix having a first binary value if the sample that is 
associated with the row exhibits the feature that is associated with the column and a 
second binary value if the sample that is associated with the row does not exhibit the 
feature that is associated with the column; and 

for each column, recursively partitioning the column relative to remaining 
ones of the columns to define a tree of conditional branches for the rows for each 
column. 

2. A method according to Claim 1 wherein the recursively partitioning is 
followed by: 

analyzing the trees of conditional branches for the columns to identify the 
conditional associations. 

3. A method according to Claim 1 wherein the recursively partitioning is 
followed by: 

displaying the trees of conditional branches for the columns to identify the 
conditional associations. 

4. A method according to Claim 1 wherein the recursively partitioning 
comprises the following that are performed for each column: 

for the column, comparing a number of occurrences of the first binary value 
in both the column and in each of the remaining columns to define a score for each of 
the remaining columns; 

selecting one of the remaining columns based upon the scores; 

dividing the rows that are associated with the one of the remaining columns 
based on whether the first value or the second value is present in the rows, to thereby 
obtain two sub-matrices and two corresponding branches of a tree; and 

repeatedly performing the comparing, selecting and dividing for the columns
of each of the sub-matrices that are associated with the two branches of the tree to
obtain remaining branches of the tree. 
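
The comparing, selecting, dividing and repeating steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the co-occurrence count used as the score, the function names, the dictionary tree layout and the `max_depth` cutoff are all assumptions made only for illustration.

```python
def cooccurrence_score(rows, target, candidate):
    """Hypothetical score: number of rows holding the first binary value (1)
    in both the target column and the candidate column."""
    return sum(1 for r in rows if r[target] == 1 and r[candidate] == 1)


def partition(rows, target, candidates, depth=0, max_depth=3):
    """Recursively split `rows` on the best-scoring remaining column."""
    if not rows or not candidates or depth >= max_depth:
        # Leaf: how many rows reached it and how many exhibit the target feature.
        return {"n": len(rows), "target_ones": sum(r[target] for r in rows)}
    scores = {c: cooccurrence_score(rows, target, c) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])  # ties: first column wins
    rest = [c for c in candidates if c != best]
    # Divide the rows by the binary value found in the selected column.
    present = [r for r in rows if r[best] == 1]
    absent = [r for r in rows if r[best] == 0]
    return {
        "split": best,
        "score": scores[best],
        "present": partition(present, target, rest, depth + 1, max_depth),
        "absent": partition(absent, target, rest, depth + 1, max_depth),
    }


def build_trees(matrix, max_depth=3):
    """Build one tree per column, each partitioned against the other columns."""
    n_cols = len(matrix[0])
    return {
        t: partition(matrix, t, [c for c in range(n_cols) if c != t], 0, max_depth)
        for t in range(n_cols)
    }


# Rows are samples, columns are features (1 = exhibited, 0 = not exhibited).
matrix = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
trees = build_trees(matrix)
```

In this toy matrix the tree for column 0 splits first on column 1, because those two columns share the first binary value in two rows.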

5. A method according to Claim 4 wherein the selecting comprises
selecting one of the remaining columns that has a maximum score. 

6. A method according to Claim 4 wherein the selecting comprises 
selecting one of the remaining columns based upon the scores and auxiliary 
information concerning the samples. 

7. A method according to Claim 4 wherein the repeatedly performing 
comprises repeatedly performing the comparing, selecting and dividing of the rows of 
each sub-matrix until a predefined termination is reached. 

8. A method according to Claim 7 wherein the predefined termination 
comprises at least one of the scores in the remaining columns being less than a 
predetermined score, the number of rows in a sub-matrix being less than a 
predetermined number and the tree having a predetermined depth. 
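
A predefined termination of this kind could be checked as sketched below. The function name, the default thresholds and the reading of the score condition as "the best remaining score falls below the threshold" are assumptions for illustration, not taken from the claims.

```python
def should_stop(scores, n_rows, depth, min_score=1.0, min_rows=5, max_depth=4):
    """Return True when at least one hypothetical termination condition holds."""
    # Assumption: the score condition is met when even the best score is too low.
    best = max(scores) if scores else float("-inf")
    return best < min_score or n_rows < min_rows or depth >= max_depth
```

Calling `should_stop([3.0], 100, 0)` returns `False` under these defaults, while a sub-matrix of only three rows stops regardless of its scores.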

9. A method according to Claim 4 wherein the comparing a number of 
occurrences of the first binary value in both the column and in each of the remaining 
columns to define a score for each of the remaining columns comprises comparing a 
number of occurrences of the first binary value in both the column and in each of the 
remaining columns using at least one of a Pearson chi-square, likelihood ratio statistic 
and measure of agreement metric. 
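
For two binary columns, the Pearson chi-square and the likelihood ratio statistic can both be computed from a 2x2 table of co-occurrence counts. The following is a minimal sketch under that assumption; the function names are hypothetical, and returning 0.0 for a degenerate table is a convenience choice rather than anything the claims specify.

```python
import math


def contingency(x, y):
    """Count the four cells of the 2x2 table for two binary columns."""
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1      # both columns hold the first binary value
        elif xi:
            b += 1      # only x holds it
        elif yi:
            c += 1      # only y holds it
        else:
            d += 1      # neither holds it
    return a, b, c, d


def pearson_chi_square(x, y):
    """Pearson chi-square for a 2x2 table: N(ad - bc)^2 / product of margins."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant column carries no association
    return n * (a * d - b * c) ** 2 / denom


def likelihood_ratio(x, y):
    """Likelihood ratio statistic G = 2 * sum(O * ln(O / E)) over nonzero cells."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = row_totals[i] * col_totals[j] / n
                g += o * math.log(o / expected)
    return 2 * g
```

Either function could stand in for the score when choosing which remaining column to split on; both return 0 when the two columns are independent in the sample.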

10. A method according to Claim 1 wherein the samples are biological 
samples, the features are genes, the first binary value indicates that the gene is 
expressed in the biological sample and the second binary value indicates that the gene 
is not expressed in the biological sample. 
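
As an illustration of such a matrix, the sketch below builds rows for samples and columns for genes from a mapping of each sample to the set of genes it expresses. The input layout, the sample and gene names, and the use of 1 and 0 as the two binary values are assumptions.

```python
def binary_matrix(samples, features):
    """Return (row names, matrix) with 1 where a sample exhibits a feature."""
    names = sorted(samples)
    matrix = [[1 if f in samples[n] else 0 for f in features] for n in names]
    return names, matrix


# Hypothetical input: sample name -> set of genes expressed in that sample.
samples = {"s1": {"geneA", "geneB"}, "s2": {"geneB"}}
names, matrix = binary_matrix(samples, ["geneA", "geneB", "geneC"])
```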



PU4070US2 



11. A method of identifying conditional associations among a plurality of 
features in a plurality of samples, the method comprising: 

defining a matrix having a plurality of rows that represent the plurality of 
samples and a plurality of columns that represent the plurality of features, each row- 
column position of the matrix having a value selected from a continuous range of 
values that indicates an amount that the sample that is associated with the row exhibits 
the feature that is associated with the column; and 

for each column, recursively partitioning the column relative to remaining 
ones of the columns to define a tree of conditional branches for the rows for each 
column. 

12. A method according to Claim 11 wherein the recursively partitioning
comprises the following that are performed for each column: 

for the column, identifying an association between the column and each of the 
remaining columns to define a score for each of the remaining columns; 

selecting one of the remaining columns based upon the scores; 

dividing the rows that are associated with the one of the remaining columns 
based on range partitions of the values in the rows to thereby obtain at least two sub- 
matrices and at least two corresponding branches of a tree; and 

repeatedly performing the identifying, selecting and dividing for the columns
of each of the sub-matrices that are associated with the at least two branches of the
tree to obtain remaining branches of the tree. 

13. A method according to Claim 12 wherein the selecting comprises 
selecting one of the remaining columns that has a highest correlation coefficient with 
the column. 
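
For continuous values, one level of the identifying, selecting and dividing steps might look like the sketch below. It assumes a Pearson correlation coefficient as the score and a two-way range partition at the median of the selected column. Both choices, along with the function names, are illustrative assumptions, and a different number of ranges or a different cut point would fit the same outline.

```python
import math
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def column(rows, j):
    """Extract column j from a list of rows."""
    return [r[j] for r in rows]


def split_once(rows, target, candidates):
    """Select the candidate most correlated with the target, then divide the
    rows into two value ranges at the median of the selected column."""
    scores = {c: pearson_r(column(rows, target), column(rows, c)) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])
    cut = median(column(rows, best))
    low = [r for r in rows if r[best] <= cut]
    high = [r for r in rows if r[best] > cut]
    return best, cut, low, high


rows = [[1.0, 2.0, 9.0], [2.0, 4.1, 7.0], [3.0, 6.0, 8.0], [4.0, 8.2, 1.0]]
best, cut, low, high = split_once(rows, target=0, candidates=[1, 2])
```

Applying `split_once` again to `low` and `high` with the selected column removed from `candidates` would yield the remaining branches.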

14. A method according to Claim 11 wherein the samples are biological
samples, the features are continuous traits of the biological samples and the value 
indicates an amount that the biological sample that is associated with the row exhibits 
the continuous trait. 

15. A method of identifying associations among a plurality of features in a
plurality of samples, the method comprising: 

generating at least two trees of conditional branches for a corresponding at 
least two of the features, each tree of conditional branches indicating conditional 
associations for a corresponding feature relative to remaining ones of the plurality of 
features. 

16. A method according to Claim 15 wherein the samples are biological 
samples and the features are discrete traits of the biological samples. 

17. A method according to Claim 15 wherein the samples are biological
samples and the features are continuous traits of the biological samples. 

18. A system for identifying conditional associations among a plurality of 
features in a plurality of samples, the system comprising: 

a matrix having a plurality of rows that represent the plurality of samples and a 
plurality of columns that represent the plurality of features, each row-column position 
of the matrix having a first binary value if the sample that is associated with the row 
exhibits the feature that is associated with the column and a second binary value if the 
sample that is associated with the row does not exhibit the feature that is associated 
with the column; and 

means for recursively partitioning each column relative to remaining ones of 
the columns to define a tree of conditional branches for the rows for each column. 

19. A system according to Claim 18 further comprising:

means for analyzing the trees of conditional branches for the columns to 
identify the conditional associations. 

20. A system according to Claim 18 further comprising: 

means for displaying the trees of conditional branches for the columns to
identify the conditional associations. 

21. A system according to Claim 18 wherein the means for recursively
partitioning comprises: 

means for comparing a number of occurrences of the first binary value in both 
the column and in each of the remaining columns to define a score for each of the 
remaining columns; 

means for selecting one of the remaining columns based upon the scores; 

means for dividing the rows that are associated with the one of the remaining 
columns based on whether the first value or the second value is present in the rows, to 
thereby obtain two sub-matrices and two corresponding branches of a tree; and 

means for repeatedly activating the means for comparing, the means for 
selecting and the means for dividing for the columns of each of the sub-matrices that
are associated with the two branches of the tree to obtain remaining branches of the 
tree. 

22. A system according to Claim 21 wherein the means for selecting 
comprises means for selecting one of the remaining columns that has a maximum 
score. 

23. A system according to Claim 21 wherein the means for selecting 
comprises means for selecting one of the remaining columns based upon the scores 
and auxiliary information concerning the samples. 

24. A system according to Claim 21 wherein the means for repeatedly 
activating comprises means for repeatedly activating the means for comparing, the 
means for selecting and the means for dividing for the rows of each sub-matrix until a 
predefined termination is reached. 

25. A system according to Claim 24 wherein the predefined termination
comprises at least one of the scores in the remaining columns being less than a 
predetermined score, the number of rows in a sub-matrix being less than a
predetermined number and the tree having a predetermined depth. 

26. A system according to Claim 21 wherein the means for comparing a 
number of occurrences of the first binary value in both the column and in each of the 
remaining columns to define a score for each of the remaining columns comprises 
means for comparing a number of occurrences of the first binary value in both the 
column and in each of the remaining columns using at least one of a Pearson chi- 
square, likelihood ratio statistic and measure of agreement metric. 

27. A system according to Claim 18 wherein the samples are biological
samples, the features are genes, the first binary value indicates that the gene is 
expressed in the biological sample and the second binary value indicates that the gene 
is not expressed in the biological sample. 

28. A system for identifying conditional associations among a plurality of 
features in a plurality of samples, the system comprising: 

a matrix having a plurality of rows that represent the plurality of samples and a 
plurality of columns that represent the plurality of features, each row-column position 
of the matrix having a value selected from a continuous range of values that indicates 
an amount that the sample that is associated with the row exhibits the feature that is 
associated with the column; and 

means for recursively partitioning each column relative to remaining ones of 
the columns to define a tree of conditional branches for the rows for each column. 

29. A system according to Claim 28 wherein the means for recursively 
partitioning comprises: 

means for identifying an association between the column and each of the 
remaining columns to define a score for each of the remaining columns; 

means for selecting one of the remaining columns based upon the scores; 

means for dividing the rows that are associated with the one of the remaining
columns based on range partitions of the values in the rows to thereby obtain at least 
two sub-matrices and at least two corresponding branches of a tree; and 

means for repeatedly activating the means for identifying, the means for
selecting and the means for dividing for the columns of each of the sub-matrices that
are associated with the at least two branches of the tree to obtain remaining branches 
of the tree. 

30. A system according to Claim 29 wherein the means for selecting 
comprises means for selecting one of the remaining columns that has a highest 
correlation coefficient with the column. 

31. A system according to Claim 28 wherein the samples are biological 
samples, the features are continuous traits of the biological samples and the value 
indicates an amount that the biological sample that is associated with the row exhibits 
the continuous trait. 

32. A system for identifying associations among a plurality of features in a 
plurality of samples, the system comprising: 

means for generating at least two trees of conditional branches for a 
corresponding at least two of the features, each tree of conditional branches indicating 
conditional associations for a corresponding feature relative to remaining ones of the 
plurality of features; and 

means for displaying the at least two trees of conditional branches. 

33. A system according to Claim 32 wherein the samples are biological
samples and the features are discrete traits of the biological samples. 

34. A system according to Claim 32 wherein the samples are biological 
samples and the features are continuous traits of the biological samples. 

35. A computer program product that identifies conditional associations
among a plurality of features in a plurality of samples, the computer program product 
comprising a computer usable storage medium having computer-readable program 
code embodied in the medium, the computer-readable program code comprising: 

computer-readable program code that is configured to define a matrix having a 
plurality of rows that represent the plurality of samples and a plurality of columns that 
represent the plurality of features, each row-column position of the matrix having a 
first binary value if the sample that is associated with the row exhibits the feature that 
is associated with the column and a second binary value if the sample that is 
associated with the row does not exhibit the feature that is associated with the column; 
and 

computer-readable program code that is configured to recursively partition 
each column relative to remaining ones of the columns to define a tree of conditional 
branches for the rows for each column. 

36. A computer program product according to Claim 35 further 
comprising: 

computer-readable program code that is configured to analyze the trees of 
conditional branches for the columns to identify the conditional associations. 

37. A computer program product according to Claim 35 further 
comprising: 

computer-readable program code that is configured to display the trees of 
conditional branches for the columns to identify the conditional associations. 

38. A computer program product according to Claim 35 wherein the
computer-readable program code that is configured to recursively partition comprises: 

computer-readable program code that is configured to compare a number of 
occurrences of the first binary value in both the column and in each of the remaining 
columns to define a score for each of the remaining columns; 

computer-readable program code that is configured to select one of the
remaining columns based upon the scores; 

computer-readable program code that is configured to divide the rows that are 
associated with the one of the remaining columns based on whether the first value or 
the second value is present in the rows, to thereby obtain two sub-matrices and two 
corresponding branches of a tree; and 

computer-readable program code that is configured to repeatedly activate the 
computer-readable program code that is configured to compare, the computer- 
readable program code that is configured to select and the computer-readable program 
code that is configured to divide for the columns of each of the sub-matrices that are
associated with the two branches of the tree to obtain remaining branches of the tree. 

39. A computer program product according to Claim 38 wherein the 
computer-readable program code that is configured to select comprises computer- 
readable program code that is configured to select one of the remaining columns that 
has a maximum score. 

40. A computer program product according to Claim 38 wherein the
computer-readable program code that is configured to select comprises computer- 
readable program code that is configured to select one of the remaining columns 
based upon the scores and auxiliary information concerning the samples. 

41. A computer program product according to Claim 38 wherein the
computer-readable program code that is configured to repeatedly activate comprises 
computer-readable program code that is configured to repeatedly activate the 
computer-readable program code that is configured to compare, the computer- 
readable program code that is configured to select and the computer-readable program 
code that is configured to divide rows of each sub-matrix until a predefined 
termination is reached. 



42. A computer program product according to Claim 41 wherein the 
predefined termination comprises at least one of the scores in the remaining columns 
being less than a predetermined score, the number of rows in a sub-matrix being less
than a predetermined number and the tree having a predetermined depth. 

43. A computer program product according to Claim 38 wherein the 
computer-readable program code that is configured to compare a number of 
occurrences of the first binary value in both the column and in each of the remaining 
columns to define a score for each of the remaining columns comprises computer- 
readable program code that is configured to compare a number of occurrences of the 
first binary value in both the column and in each of the remaining columns using at 
least one of a Pearson chi-square, likelihood ratio statistic and measure of agreement 
metric. 

44. A computer program product according to Claim 35 wherein the 
samples are biological samples, the features are genes, the first binary value indicates 
that the gene is expressed in the biological sample and the second binary value 
indicates that the gene is not expressed in the biological sample. 

45. A computer program product that identifies conditional associations
among a plurality of features in a plurality of samples, the computer program product 
comprising a computer usable storage medium having computer-readable program 
code embodied in the medium, the computer-readable program code comprising: 

computer-readable program code that is configured to define a matrix having a 
plurality of rows that represent the plurality of samples and a plurality of columns that 
represent the plurality of features, each row-column position of the matrix having a 
value selected from a continuous range of values that indicates an amount that the 
sample that is associated with the row exhibits the feature that is associated with the 
column; and 

computer-readable program code that is configured to recursively partition 
each column relative to remaining ones of the columns to define a tree of conditional 
branches for the rows for each column. 

46. A computer program product according to Claim 45 wherein the
computer-readable program code that is configured to recursively partition comprises: 

computer-readable program code that is configured to identify an association 
between the column and each of the remaining columns to define a score for each of 
the remaining columns; 

computer-readable program code that is configured to select one of the 
remaining columns based upon the scores; 

computer-readable program code that is configured to divide the rows that are 
associated with the one of the remaining columns based on range partitions of the 
values in the rows to thereby obtain at least two sub-matrices and at least two 
corresponding branches of a tree; and 

computer-readable program code that is configured to repeatedly activate the 
computer-readable program code that is configured to identify, the computer-
readable program code that is configured to select and the computer-readable program
code that is configured to divide for the columns of each of the sub-matrices that are
associated with the at least two branches of the tree to obtain remaining branches of 
the tree. 

47. A computer program product according to Claim 46 wherein the 
computer-readable program code that is configured to select comprises computer- 
readable program code that is configured to select one of the remaining columns that 
has a highest correlation coefficient with the column. 

48. A computer program product according to Claim 45 wherein the 
samples are biological samples, the features are continuous traits of the biological 
samples and the value indicates an amount that the biological sample that is 
associated with the row exhibits the continuous trait. 

49. A computer program product that identifies associations among a 
plurality of features in a plurality of samples, the computer program product 
comprising a computer usable storage medium having computer-readable program
code embodied in the medium, the computer-readable program code comprising: 

computer-readable program code that is configured to generate at least two 
trees of conditional branches for a corresponding at least two of the features, each tree 
of conditional branches indicating conditional associations for a corresponding feature 
relative to remaining ones of the plurality of features. 

50. A computer program product according to Claim 49 wherein the 
samples are biological samples and the features are discrete traits of the biological 
samples. 

51. A computer program product according to Claim 49 wherein the 
samples are biological samples and the features are continuous traits of the biological 
samples. 



